Classifiers and their Metrics Quantified

Author

  • J B Brown
Abstract

Molecular modeling frequently constructs classification models to predict two-class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well-known metrics such as accuracy or true positive rate. However, when applied to retrospective and/or artificially generated prediction datasets, these frequently used metrics can overestimate true performance in actual prospective experiments. Here, we systematically consider how metric value surfaces are generated as a consequence of data balance, and propose computing an inverse cumulative distribution function over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation.
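
The abstract does not spell out how the metric surface or its inverse cumulative distribution is constructed, so the following is only a minimal sketch under stated assumptions: the surface is taken over every confusion matrix possible for a fixed number of positives and negatives, and the "inverse cumulative distribution" is read as the fraction of that surface at or above a threshold. The function names (`metric_surface`, `fraction_at_least`) and the example class counts are hypothetical and not taken from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all instances that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of the true positive rate and the true negative rate."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def metric_surface(metric, n_pos, n_neg):
    """Metric value at every possible (TP, TN) pair for a fixed class balance.

    Assumption: each confusion matrix is one point on the surface,
    with FP = n_neg - TN and FN = n_pos - TP.
    """
    return [
        metric(tp, tn, n_neg - tn, n_pos - tp)
        for tp in range(n_pos + 1)
        for tn in range(n_neg + 1)
    ]


def fraction_at_least(surface, threshold):
    """Share of surface points whose metric value is >= threshold.

    Assumption: this stands in for the inverse cumulative distribution.
    """
    return sum(value >= threshold for value in surface) / len(surface)


# Compare an imbalanced (10:90) and a balanced (50:50) dataset of 100 items.
for n_pos, n_neg in [(10, 90), (50, 50)]:
    for metric in (accuracy, balanced_accuracy):
        surface = metric_surface(metric, n_pos, n_neg)
        share = fraction_at_least(surface, 0.9)
        print(f"{n_pos}:{n_neg} {metric.__name__}: {share:.4f} of surface >= 0.9")
```

Under these assumptions, a larger share of the accuracy surface reaches 0.9 for the 10:90 dataset than for the 50:50 dataset, which is one way to see why a high score on imbalanced data is less informative than the same score on balanced data.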

Similar resources

Evaluation of Classifiers in Software Fault-Proneness Prediction

Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

Competence-conscious associative classification

The classification performance of an associative classifier is strongly dependent on the statistic measure or metric that is used to quantify the strength of the association between features and classes (i.e., confidence, correlation etc.). Previous studies have shown that classifiers produced by different metrics may provide conflicting predictions, and that the best metric to use is data-depe...

Compositional Mechanisms of Japanese Numeral Classifiers

This paper suggests that Generative Lexicon Theory (Pustejovsky, 1995, 2006, 2011) offers a new analysis of numeral classifiers, focusing on Japanese having various kinds of classifiers. It is often said that classifiers agree with quantified nouns, that is, the nouns have to match the semantic requirements of the classifiers. This paper examines their lexical structures and compositional mecha...

The Metric Dilemma: Competence-Conscious Associative Classification

The classification performance of an associative classifier is strongly dependent on the statistic measure or metric that is used to quantify the strength of the association between features and classes (i.e., confidence, correlation etc.). Previous studies have shown that classifiers produced by different metrics may provide conflicting predictions, and that the best metric to use is data-depe...

Using Fractal Dimension to Investigate the Effect of Scale on the Sensitivity of Landscape Metrics

The sensitivity of landscape metrics to the scale effect is one of the most challenging issues in landscape ecology and quantification of land use spatial patterns. In this study, fractal dimension was employed to assess the effect of scale on the sensitivity of landscape metric in the north of Iran (around Sari) as the case study. Land use/ cover maps were derived from Landsat-8 (OLI sensor) i...

Journal title:

Volume 37, Issue

Pages -

Publication date: 2018